Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications

نویسندگان

  • Scott A. Crossley
  • Tom Cobb
  • Danielle S. McNamara
چکیده

In assessments of second language (L2) writing, quality of lexis typically claims more variance than other factors, and the most readily operationalized measure of lexical quality is word frequency. This study compares two methods of automatically assessing word frequency in learner productions. The first method, a band-based method, involves lexical frequency profiling, a procedure that first groups individual words into families and then sorts these into corpus-based frequency bands. The second method, a countbased method, assigns a normalized corpus frequency count to each individual word form used, yielding an average count for a text. Both band and count-based methods were used to analyze 100 L2 learner and 30 native speaker freewrites that had been classified according to proficiency level (i.e., native speakers and beginning, intermediate and advanced L2 learners). Machine learning algorithms were used to classify the texts into their respective proficiency levels with results indicating that count-based word frequency indices accurately classified 58% of the texts while band-based indices reported accuracies that were between 10% and 22% lower than count-based indices. 2013 Published by Elsevier Ltd.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Developing a Corpus-Based Word List in Pharmacy Research ‎Articles: A Focus on Academic Culture

The present corpus-based lexical study reports the development of a Pharmacy Academic Word List (PAWL); a list of the most frequent words from a corpus of 3,458,445 tokens made up of 800 most recent pharmacy texts including research articles, review articles, and short communications in four sub-disciplines of pharmacy. WordSmith (Scott, 2017) and AntWordProfiler (Anthony, 2014) were used to sc...

متن کامل

High- and Mid-Frequency Vocabulary Size as Predictors of Iranian University EFL Students’ Speaking Performance

Literature is replete with the studies focusing on the role of vocabulary knowledge in second language receptive skills. However, the relationship between the aspects of vocabulary knowledge and productive skills in general, and the speaking performance in particular has remained scanty in the related literature. This paper examined the relationship between knowledge of L2 vocabulary size at di...

متن کامل

Do We Need Discipline-Specific Academic Word Lists? Linguistics Academic Word List (LAWL)

This corpus-based study aimed at exploring the most frequently-used academic words in linguistics and compare the wordlist with the distribution of high frequency words in Coxhead’s Academic Word List (AWL) and West’s General Service List (GSL) to examine their coverage within the linguistics corpus. To this end, a corpus of 700 linguistics research articles (LRAC), consisting of approximately ...

متن کامل

The Vocabulary Profile of Iranian English Teaching School books

This paper provides a fairly detailed corpus-based vocabulary profile of the Iranian EFL books used in public schools. To this end, the WordPerfect files of all the seven books were converted to text format to get rid of the formatting features and be compatible with the software used for analysis. The software tools used were the Compleat Lexical Tutor suite, version 6.2 (Cobb, 2011), AntConc ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013